Parallel univariate decision trees

نویسندگان

  • Olcay Taner Yildiz
  • Onur Dikmen
چکیده

Univariate decision tree algorithms are widely used in Data Mining because (i) they are easy to learn (ii) when trained they can be expressed in rule based manner. In several applications mainly including Data Mining, the dataset to be learned is very large. In those cases it is highly desirable to construct univariate decision trees in reasonable time. This may be accomplished by parallelizing univariate decision tree algorithms. In this paper, we first present two different univariate decision tree algorithms C4.5 and univariate Linear Discriminant Tree. We show how to parallelize these algorithms in three ways: (i) feature based, (ii) node based (iii) data based manners. Experimental results show that performance of the parallelizations highly depend on the dataset and the node based parallelization demonstrate good speedups.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Global Induction of Decision Trees

Decision trees are, besides decision rules, one of the most popular forms of knowledge representation in Knowledge Discovery in Databases process (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996) and implementations of the classical decision tree induction algorithms are included in the majority of data mining systems. A hierarchical structure of a tree-based classifier, where appropriate t...

متن کامل

Global Induction of Decision Trees: From Parallel Implementation to Distributed Evolution

In most of data mining systems decision trees are induced in a top-down manner. This greedy method is fast but can fail for certain classification problems. As an alternative a global approach based on evolutionary algorithms (EAs) can be applied. We developed Global Decision Tree (GDT) system, which learns a tree structure and tests in one run of the EA. Specialized genetic operators are used,...

متن کامل

A framework for bottom-up induction of oblique decision trees

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, specially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes...

متن کامل

Discovery of Relevant New Features by Generating Non-Linear Decision Trees

field of manufacturing new features. Most decision tree algorithms using selective induction focus on univariate, i.e. axis-parallel tests at each internal node of a tree. Oblique decision trees use multivariate linear tests at each non-leaf node. One well-known limitation of selective induction algorithms, however, is its inadequate description of hypotheses by task-supplied original features....

متن کامل

On the VC-Dimension of Univariate Decision Trees

In this paper, we give and prove lower bounds of the VC-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. In our previous work (Aslan et al., 2009), we proposed a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively. Using...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Pattern Recognition Letters

دوره 28  شماره 

صفحات  -

تاریخ انتشار 2007